Version: 1.0.0

Operations

1. groupBy

Syntax:  groupBy(fieldsArr, reducers)

parameters:
 - fieldsArr:
   required: true
   type: Array of string
   description: Array containing the names of the dimensions to group by.
 - reducers:
   required: false
   type: Array of Array
   default: []
   description: An array of pairs (two-element arrays) whose 0th index is the variable name and
                whose 1st index is the name of the aggregation function.

Groups the data by the given dimensions and reduces the measures. It expects a list of dimensions, projects the DataModel onto them, and performs aggregations to collapse the duplicate tuples.

DataModel provides built-in aggregation functions for aggregating the grouped measure values.

Returns: DataModel (Returns a new DataModel instance after performing the groupBy)

const Datamodel = muze.DataModel;
const data = [
  {
    Maker: "chevrolet",
    Name: "chevrolet chevelle malibu",
    Miles_per_Gallon: 18,
    Cylinders: 8,
    Displacement: 307,
    Horsepower: 130,
    Weight_in_lbs: 3504,
    Acceleration: 12,
    Year: "1970-01-01",
    Origin: "USA",
  },
  {
    Maker: "buick",
    Name: "buick skylark 320",
    Miles_per_Gallon: 15,
    Cylinders: 8,
    Displacement: 350,
    Horsepower: 165,
    Weight_in_lbs: 3693,
    Acceleration: 11.5,
    Year: "1970-01-01",
    Origin: "USA",
  },
  // ... and so on...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Maker",
    type: "dimension",
  },
  {
    name: "Miles_per_Gallon",
    type: "measure",
    defAggFn: "avg",
  },
  {
    name: "Displacement",
    type: "measure",
    defAggFn: "sum",
  },
  {
    name: "Horsepower",
    type: "measure",
    defAggFn: "sum",
  },
  {
    name: "Weight_in_lbs",
    type: "measure",
    defAggFn: "min",
  },
  {
    name: "Acceleration",
    type: "measure",
    defAggFn: "sum",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
  {
    name: "Year",
    type: "dimension",
    subtype: "temporal",
    format: "%Y-%m-%d",
  },
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.groupBy(
  ["Year"],
  // reducers is an array of [variable name, aggregation function] pairs
  [["Horsepower", Datamodel.AggregationFunctions.MAX]],
);

Printing outputDM gives:

Year	Miles_per_Gallon	Displacement	Horsepower	Weight_in_lbs	Acceleration
-19800000	18	455	147.55882352941177	1835	12.544117647058824
3151620000	21.25	400	104.92857142857143	1613	15.310344827586206
6305220000	18.714285714285715	429	120.17857142857143	2100	15.125
9467460000	17.1	455	130.475	1867	14.3125
12621060000	22.703703703703702	350	94.23076923076923	1649	16.203703703703702
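
To make the grouping semantics concrete, here is a minimal plain-JavaScript sketch that does not use DataModel at all. The `groupRows` helper, the `aggregators` table and the string aggregation names are hypothetical stand-ins for illustration, and the 1971 row is made-up sample data. The sketch only shows the idea of bucketing rows by dimension values and reducing each bucket with `[measure, aggregation]` pairs.

```javascript
// Plain-JavaScript sketch of grouping; it does not use or reproduce the DataModel API.
// groupRows and aggregators are hypothetical helpers for illustration only.
const aggregators = {
  sum: (values) => values.reduce((a, b) => a + b, 0),
  avg: (values) => values.reduce((a, b) => a + b, 0) / values.length,
  min: (values) => Math.min(...values),
  max: (values) => Math.max(...values),
};

function groupRows(rows, dimensions, reducers) {
  const groups = new Map();
  for (const row of rows) {
    // Rows that share the same value for every grouping dimension land in one bucket.
    const key = JSON.stringify(dimensions.map((d) => row[d]));
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return [...groups.values()].map((bucket) => {
    const out = {};
    for (const d of dimensions) out[d] = bucket[0][d];
    // Each [measure, aggregation name] pair collapses the bucket to a single value.
    for (const [measure, fnName] of reducers) {
      out[measure] = aggregators[fnName](bucket.map((r) => r[measure]));
    }
    return out;
  });
}

// The first two rows come from the cars sample above; the third is invented.
const sampleRows = [
  { Year: "1970-01-01", Horsepower: 130 },
  { Year: "1970-01-01", Horsepower: 165 },
  { Year: "1971-01-01", Horsepower: 90 },
];
const grouped = groupRows(sampleRows, ["Year"], [["Horsepower", "max"]]);
console.log(grouped);
```

The two 1970 rows collapse into one row holding the larger Horsepower value, which is the same shape of result that a groupBy on Year with a MAX reducer is described as producing.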

2. sort

Syntax:  sort(sortingDetails)

parameters:
 - sortingDetails:
   required: true
   type: Array of Array
   description: Sorting details based on which the sorting is performed.

Performs sorting according to the specified sorting details. Like every other operator, it doesn't mutate the DataModel instance on which it is called; instead, it returns a new DataModel instance containing the sorted data.

DataModel supports multi-level sorting: list the variables to sort by, each with its sort order (ASC or DESC).

Returns: DataModel (Returns a new instance of DataModel with sorted data)

In the following example, the data is sorted by the Origin field in DESC order at the first level, followed by a second level of sorting by Acceleration in ASC order.

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  //... Cars Schema as shown in above example ...
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.sort([
  ["Origin", "desc"],
  ["Acceleration"], // Default value is ASC
]);

Printing outputDM gives:

Name	Maker	Miles_per_Gallon	Displacement	Horsepower	Weight_in_lbs	Acceleration	Origin	Cylinders	Year
plymouth 'cuda 340	plymouth	14	340	160	3609	8	USA	8	-19800000
ford mustang boss 302	ford	NaN	302	140	3353	8	USA	8	-19800000
plymouth fury iii	plymouth	14	440	215	4312	8.5	USA	8	-19800000
amc ambassador dpl	amc	15	390	190	3850	8.5	USA	8	-19800000
chevrolet impala	chevrolet	14	454	220	4354	9	USA	8	-19800000
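
The multi-level ordering can be pictured with a small plain-JavaScript comparator. This is only a sketch under assumptions: `sortRows` is a hypothetical helper rather than part of DataModel, and it treats a missing sort order as ascending, mirroring the comment in the example above.

```javascript
// Plain-JavaScript sketch of multi-level sorting; sortRows is a hypothetical helper.
function sortRows(rows, sortingDetails) {
  // Copy first so the input array is left untouched, like the non-mutating operator.
  return [...rows].sort((a, b) => {
    for (const [field, order = "asc"] of sortingDetails) {
      if (a[field] === b[field]) continue; // tie at this level: try the next level
      const cmp = a[field] < b[field] ? -1 : 1;
      return order.toLowerCase() === "desc" ? -cmp : cmp;
    }
    return 0; // equal on every level
  });
}

// Three rows taken from the cars tables in this section.
const cars = [
  { Name: "toyota corona mark ii", Origin: "Japan", Acceleration: 15 },
  { Name: "chevrolet chevelle malibu", Origin: "USA", Acceleration: 12 },
  { Name: "plymouth 'cuda 340", Origin: "USA", Acceleration: 8 },
];
const sortedCars = sortRows(cars, [["Origin", "desc"], ["Acceleration"]]);
console.log(sortedCars.map((r) => r.Name));
```

Rows are compared on Origin first, and Acceleration is consulted only when two rows share the same Origin, so both USA rows come before the Japan row and are ordered by ascending Acceleration among themselves.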

3. calculateVariable

Syntax:  calculateVariable(schema, fieldNames, resolverFunction)

parameters:
 - schema:
   required: true
   type: JSON
   description: Schema of the newly defined variable
 - fieldNames:
   required: true
   type: Array
   description: Array of the names of the existing variables from which the new variable is derived
 - resolverFunction:
   required: true
   type: Function
   description: A function that defines how each value of the new field is generated from the
                corresponding values of the fields listed in fieldNames

Returns: DataModel (Instance of DataModel with the new field)

Creates a new variable calculated from existing variables. This method expects the definition of the new variable and a function that resolves the value of the new variable from the existing variables.

Creates a new measure based on existing variables:

Example 1

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
  {
    name: "Horsepower",
    type: "measure",
    defAggFn: "avg",
  },
  {
    name: "Weight_in_lbs",
    type: "measure",
    defAggFn: "min",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.calculateVariable(
  {
    name: "powerToWeight",
    type: "measure", // Schema of variable
  },
  ["Horsepower", "Weight_in_lbs"],
  (hp, weight) => hp / weight,
);

Original DataModel

Origin	Cylinders	Horsepower	Weight_in_lbs
USA	8	130	3504
USA	8	165	3693
USA	8	150	3436
USA	8	150	3433
USA	8	140	3449

New DataModel

Origin	Cylinders	Horsepower	Weight_in_lbs	powerToWeight
USA	8	130	3504	0.037100456621004564
USA	8	165	3693	0.04467912266450041
USA	8	150	3436	0.043655413271245634
USA	8	150	3433	0.043693562481794346
USA	8	140	3449	0.0405914757900840

Example 2

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const formattedData = await Datamodel.loadData(data, schema);
let dm = new Datamodel(formattedData);
const outputDM = dm.calculateVariable(
  {
    name: "Efficiency",
    type: "dimension",
  },
  ["Horsepower"],
  (hp) => {
    if (hp < 80) {
      return "low";
    } else if (hp < 120) {
      return "moderate";
    } else {
      return "high";
    }
  },
);

Printing outputDM gives:

Name	Maker	Miles_per_Gallon	Displacement	Horsepower	Weight_in_lbs	Acceleration	Origin	Cylinders	Year	Efficiency
chevrolet chevelle malibu	chevrolet	18	307	130	3504	12	USA	8	-19800000	high
buick skylark 320	buick	15	350	165	3693	11.5	USA	8	-19800000	high
plymouth satellite	plymouth	18	318	150	3436	11	USA	8	-19800000	high
amc rebel sst	amc	16	304	150	3433	12	USA	8	-19800000	high
ford torino	ford	17	302	140	3449	10.5	USA	8	-19800000	high
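
Conceptually, the resolver function is called once per row with the values of the listed fields, and its return value becomes the new field. The following plain-JavaScript sketch illustrates that idea only; `addCalculatedField` is a hypothetical helper and is not how DataModel is implemented.

```javascript
// Plain-JavaScript sketch of a calculated field; addCalculatedField is hypothetical.
function addCalculatedField(rows, fieldSchema, fieldNames, resolver) {
  // Build new row objects so the input rows stay unchanged.
  return rows.map((row) => ({
    ...row,
    // The resolver receives the dependency values in the order of fieldNames.
    [fieldSchema.name]: resolver(...fieldNames.map((name) => row[name])),
  }));
}

// Two rows taken from the "Original DataModel" table above.
const originalRows = [
  { Origin: "USA", Horsepower: 130, Weight_in_lbs: 3504 },
  { Origin: "USA", Horsepower: 165, Weight_in_lbs: 3693 },
];
const withRatio = addCalculatedField(
  originalRows,
  { name: "powerToWeight", type: "measure" },
  ["Horsepower", "Weight_in_lbs"],
  (hp, weight) => hp / weight,
);
console.log(withRatio);
```

Each output row keeps its original fields and gains `powerToWeight`, while `originalRows` itself is left as it was.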

4. select

Syntax: select(conditions, config)

parameters:
 - conditions:
   required: true
   type: Object
   description: An object that governs the selection of values from the data
   default: {}
   parameters: SelectionParameters
 - config:
   required: false
   type: Object
   default: {}
   description: The configuration object that controls the inclusion or exclusion of a row in the resultant DataModel instance.
   parameters:
     - mode:
       required: true
       type: FilteringModes
       description: The mode of the selection

Comparison Operations

parameters:
  - field:
    required: true
    type: string
    description: The field name to compare
  - operator:
    required: true
    type: ComparisonOperator
    description: The comparison operation to perform
  - value:
    required: true
    type: SupportedDataTypes | array<SupportedDataTypes>
    description: The value to compare with

Logical Operations

parameters:
  - operator:
    required: true
    type: LogicalOperator
    description: The operation with which all the comparison operations are connected
  - conditions:
    required: true
    type: array<ComparisonOperations>
    description: The list of comparison operations

SupportedDataTypes: string | number | null | undefined
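
One way to read the two parameter shapes above is as a tree: a comparison operation is a leaf, and a logical operation combines the results of its child conditions. The plain-JavaScript sketch below evaluates such a tree against a single row. The `EQUAL`, `AND` and `OR` strings are hypothetical stand-ins for the operator constants, `matches` is a hypothetical helper, and only the EQUAL comparison is covered.

```javascript
// Plain-JavaScript sketch of evaluating a condition tree; not the DataModel implementation.
// These string constants are hypothetical stand-ins for the real operator constants.
const EQUAL = "EQUAL";
const AND = "AND";
const OR = "OR";

function matches(row, node) {
  if (Array.isArray(node.conditions)) {
    // Logical operation: combine the results of all child conditions.
    const results = node.conditions.map((child) => matches(row, child));
    if (node.operator === AND) return results.every(Boolean);
    if (node.operator === OR) return results.some(Boolean);
    throw new Error(`Unsupported logical operator in this sketch: ${node.operator}`);
  }
  // Comparison operation: compare one field of the row with the given value.
  if (node.operator === EQUAL) return row[node.field] === node.value;
  throw new Error(`Unsupported comparison operator in this sketch: ${node.operator}`);
}

// Origin is Japan AND (Cylinders is 3 OR 6 OR 8)
const conditionTree = {
  operator: AND,
  conditions: [
    { field: "Origin", value: "Japan", operator: EQUAL },
    {
      operator: OR,
      conditions: [
        { field: "Cylinders", value: "3", operator: EQUAL },
        { field: "Cylinders", value: "6", operator: EQUAL },
        { field: "Cylinders", value: "8", operator: EQUAL },
      ],
    },
  ],
};

console.log(matches({ Origin: "Japan", Cylinders: "3" }, conditionTree)); // true
console.log(matches({ Origin: "Japan", Cylinders: "4" }, conditionTree)); // false
console.log(matches({ Origin: "USA", Cylinders: "8" }, conditionTree)); // false
```

Because a logical operation's `conditions` entries are evaluated recursively, a nested logical operation can sit alongside plain comparisons in the same list.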

FilteringModes operate on the selection and rejection sets to determine which of them is reflected in the resultant DataModel. The filtering modes are:

INVERSE
NORMAL
ALL

Note: The selection and rejection sets are only a logical idea, used here to explain the concept.
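
The selection and rejection sets can be sketched in plain JavaScript as follows. This is an illustration under assumptions, not the library's code: `applyMode` and the mode strings are hypothetical, NORMAL is assumed to keep the selection set, INVERSE the rejection set, and ALL is assumed to hand back both sets as a pair. Check the FilteringModes reference for the exact behaviour.

```javascript
// Plain-JavaScript sketch of filtering modes; names and return shapes are assumptions.
const FilteringModes = { NORMAL: "normal", INVERSE: "inverse", ALL: "all" };

function applyMode(rows, predicate, mode = FilteringModes.NORMAL) {
  const selection = rows.filter((row) => predicate(row)); // rows that satisfy the condition
  const rejection = rows.filter((row) => !predicate(row)); // all remaining rows
  if (mode === FilteringModes.INVERSE) return rejection;
  if (mode === FilteringModes.ALL) return [selection, rejection]; // assumed: both sets
  return selection;
}

// Rows taken from the cars tables in this section.
const carRows = [
  { Name: "toyota corona", Origin: "Japan" },
  { Name: "ford torino", Origin: "USA" },
  { Name: "datsun pl510", Origin: "Japan" },
];
const isJapanese = (row) => row.Origin === "Japan";

const normalRows = applyMode(carRows, isJapanese);
const inverseRows = applyMode(carRows, isJapanese, FilteringModes.INVERSE);
const allRows = applyMode(carRows, isJapanese, FilteringModes.ALL);
console.log(normalRows.length, inverseRows.length, allRows.map((set) => set.length));
```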

Returns: DataModel (Returns an instance of DataModel with the data selected according to the conditions)

Example 1

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const { EQUAL } = Datamodel.ComparisonOperators;
const { AND, OR } = Datamodel.LogicalOperators;
const selectedDM = dm.select({
  field: "Origin",
  value: "Japan",
  operator: EQUAL,
});

Printing selectedDM gives:

Name	Origin	Cylinders
toyota corona mark ii	Japan	4
datsun pl510	Japan	4
datsun pl510	Japan	4
toyota corona	Japan	4
toyota corolla 1200	Japan	4

Example 2

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const { EQUAL } = Datamodel.ComparisonOperators;
const { AND, OR } = Datamodel.LogicalOperators;
const selectedDM = dm.select({
  conditions: [
    { field: "Origin", value: "Japan", operator: EQUAL },
    {
      conditions: [
        { field: "Cylinders", value: "3", operator: EQUAL },
        { field: "Cylinders", value: "6", operator: EQUAL },
        { field: "Cylinders", value: "8", operator: EQUAL },
      ],
      operator: OR,
    },
  ],
  operator: AND,
});

Printing selectedDM gives:

Name	Origin	Cylinders
mazda rx2 coupe	Japan	3
maxda rx3	Japan	3
toyota mark ii	Japan	6
toyota mark ii	Japan	6
datsun 810	Japan	6

5. project

Syntax: project(projField, config)

parameters:
 - projField:
   required: true
   type: Array<(string|RegExp)>
   description: An array of column names, given as strings or regular expressions.
 - config:
   required: false
   type: Object
   default: {}
   description: The configuration object that controls the creation of the new DataModel.
   parameters:
     - mode:
       required: true
       type: FilteringModes
       description: Mode of the projection

Projection is a column (field) filtering operation. It expects a list of field names and either includes or excludes those fields in the resultant DataModel, depending on the FilteringMode.

Projection expects an array of field names, from which it creates the selection and rejection sets. Every field whose name is present in the array goes into the selection set, and the rest of the fields go into the rejection set.

FilteringModes operate on the selection and rejection sets to determine which of them is reflected in the resultant DataModel.

Note: The selection and rejection sets are only a logical idea, used here to explain the concept.
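
For projection, the same two sets are built over field names rather than rows. The plain-JavaScript sketch below shows one plausible way to match names given as strings or regular expressions. `projectFields` is a hypothetical helper, and the matching rules (exact equality for strings, `RegExp.prototype.test` for regular expressions) are assumptions rather than documented behaviour.

```javascript
// Plain-JavaScript sketch of field projection; projectFields is a hypothetical helper.
function projectFields(fieldNames, projField, inverse = false) {
  // A field is in the selection set if any entry of projField matches its name.
  const isSelected = (name) =>
    projField.some((p) => (p instanceof RegExp ? p.test(name) : p === name));
  // inverse = false keeps the selection set, inverse = true keeps the rejection set.
  return fieldNames.filter((name) => isSelected(name) !== inverse);
}

const fieldNames = ["Name", "Origin", "Cylinders"];
const keptByInverse = projectFields(fieldNames, ["Name"], true);
const keptByPattern = projectFields(fieldNames, [/^O/, "Name"]);
console.log(keptByInverse); // [ 'Origin', 'Cylinders' ]
console.log(keptByPattern); // [ 'Name', 'Origin' ]
```

The first call mirrors the idea of projecting `["Name"]` in an inverse mode: Name falls in the selection set, so the rejection set of Origin and Cylinders is what remains.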

Returns: DataModel (Returns an instance of DataModel with the data projected according to the field names)

Example:

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDM = dm.project(["Name"], { mode: Datamodel.FilteringModes.INVERSE });

Printing outputDM gives:

Origin	Cylinders
USA	8
USA	8
USA	8
USA	8
USA	8

6. splitByRow

Syntax: splitByRow(fieldNames)

parameters:
 - fieldNames:
   required: true
   type: Array<string>
   description: An array of column names as strings.

Returns: Array<DataModel> (Returns an array of DataModel instances, split according to the field names)

This method splits the data into groups, one for each unique combination of the values of the given fields.

Example 1

If the cars data used in this section is split by Origin, the result is an array of three DataModels, one each for USA, Japan and Europe:

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Acceleration",
    type: "measure",
    defAggFn: "avg",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDm = dm.splitByRow(["Origin"]);
for (let i = 0; i < outputDm.length; i++) {
  printDM(outputDm[i]);
}

Note: printDM is a utility function that renders a DataModel on a webpage, for demonstration purposes only.

The output gives 3 datamodels as shown below:

Name	Acceleration	Origin	Cylinders
citroen ds-21 pallas	17.5	Europe	4
volkswagen 1131 deluxe sedan	20.5	Europe	4
peugeot 504	17.5	Europe	4
audi 100 ls	14.5	Europe	4
saab 99e	17.5	Europe	4

Name	Acceleration	Origin	Cylinders
chevrolet chevelle malibu	12	USA	8
buick skylark 320	11.5	USA	8
plymouth satellite	11	USA	8
amc rebel sst	12	USA	8
ford torino	10.5	USA	8

Name	Acceleration	Origin	Cylinders
toyota corona mark ii	15	Japan	4
datsun pl510	14.5	Japan	4
datsun pl510	14.5	Japan	4
toyota corona	14	Japan	4
toyota corolla 1200	19	Japan	4

Example 2

Similarly, the data can be split by unique combinations of the values of more than one field. In the following sample, the data is split by Origin and Cylinders, and 9 DataModels are generated because the values of Origin and Cylinders form 9 unique groups:

const Datamodel = muze.DataModel;
const data = [
  //... Cars Data as shown in above example ...
];
const schema = [
  {
    name: "Name",
    type: "dimension",
  },
  {
    name: "Acceleration",
    type: "measure",
    defAggFn: "avg",
  },
  {
    name: "Origin",
    type: "dimension",
  },
  {
    name: "Cylinders",
    type: "dimension",
  },
];

const formattedData = await Datamodel.loadData(data, schema);
const dm = new Datamodel(formattedData);
const outputDm = dm.splitByRow(["Origin", "Cylinders"]);
for (let i = 0; i < outputDm.length; i++) {
  printTable(
    outputDm[i].getData().data,
    ["Name", "Acceleration", "Origin", "Cylinders"],
    { rowLimit: 1 },
  );
}

Note: printTable is a utility function that renders a DataModel on a webpage, for demonstration purposes only.

Name	Acceleration	Origin	Cylinders
chevrolet chevelle malibu	12	USA	8

Name	Acceleration	Origin	Cylinders
audi 5000	15.9	Europe	5

Name	Acceleration	Origin	Cylinders
mercedes-benz 280s	16.7	Europe	6

Name	Acceleration	Origin	Cylinders
toyota mark ii	13.5	Japan	6

Name	Acceleration	Origin	Cylinders
mazda rx2 coupe	13.5	Japan	3

Name	Acceleration	Origin	Cylinders
chevrolet vega 2300	15.5	USA	4

Name	Acceleration	Origin	Cylinders
plymouth duster	15.5	USA	6

Name	Acceleration	Origin	Cylinders
toyota corona mark ii	15	Japan	4

Name	Acceleration	Origin	Cylinders
citroen ds-21 pallas	17.5	Europe	4
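
The splitting rule itself, one group per unique combination of field values, can be sketched in plain JavaScript. `splitRows` is a hypothetical helper that works on plain row objects and returns arrays of rows rather than DataModel instances. It is meant only to illustrate why adding a field to the list can increase the number of groups.

```javascript
// Plain-JavaScript sketch of splitting rows by field values; splitRows is hypothetical.
function splitRows(rows, fieldNames) {
  const buckets = new Map();
  for (const row of rows) {
    // The combination of values for the listed fields identifies the group.
    const key = JSON.stringify(fieldNames.map((name) => row[name]));
    if (!buckets.has(key)) buckets.set(key, []);
    buckets.get(key).push(row);
  }
  // One array of rows per unique combination, in order of first appearance.
  return [...buckets.values()];
}

// Four rows taken from the output tables above.
const splitInput = [
  { Name: "chevrolet chevelle malibu", Origin: "USA", Cylinders: "8" },
  { Name: "chevrolet vega 2300", Origin: "USA", Cylinders: "4" },
  { Name: "toyota corona mark ii", Origin: "Japan", Cylinders: "4" },
  { Name: "citroen ds-21 pallas", Origin: "Europe", Cylinders: "4" },
];
const byOrigin = splitRows(splitInput, ["Origin"]);
const byOriginAndCylinders = splitRows(splitInput, ["Origin", "Cylinders"]);
console.log(byOrigin.length); // 3
console.log(byOriginAndCylinders.length); // 4
```

With these four rows, splitting by Origin alone gives three groups, while splitting by Origin and Cylinders separates the two USA rows and gives four.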
